Noise robustness in speech to speech translation
Abstract
This paper describes various noise robustness issues in a speech-to-speech translation system. We present quantitative measures of noise robustness in terms of speech recognition accuracy and speech-to-speech translation performance. To enhance noise immunity, we explore two approaches to improving overall speech-to-speech translation performance. First, a multi-style training technique is used to tackle environmental degradation at the acoustic model level. Second, a pre-processing technique, CDCN (codeword-dependent cepstral normalization), is used to compensate for acoustic distortion at the signal level. Combining both schemes yields further improvement. Beyond recognition accuracy itself, this paper examines how closely speech recognition accuracy is related to overall speech-to-speech translation performance. When we apply the proposed schemes to an English-to-Chinese translation task, the word error rate of our speech recognition subsystem is reduced by 28% relative, from 18.9% to 13.2%, for test data at 15 dB SNR. The corresponding BLEU score for the overall speech-to-speech translation improves from 0.43 to 0.478. Similar improvements are also observed at a lower SNR condition.
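The abstract does not spell out how the noisy training or test material was produced. As a rough illustration only, the sketch below shows one common way to mix noise into clean speech at a target SNR (such as the 15 dB condition mentioned above) and to assemble a multi-style training set from the results. The function names, the list of SNR levels, and the synthetic sine-plus-Gaussian signals are assumptions made for this example, not details taken from the paper.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mixture has the requested SNR in dB."""
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise signal has zero power")
    # SNR_dB = 10 * log10(P_clean / P_scaled_noise); solve for the noise gain.
    target_noise_power = power(clean) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR of a mixture, given the clean reference it was built from."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(residual))


def build_multistyle_set(clean_utterances, noise, snr_levels_db):
    """Keep each clean utterance and add one noisy copy per SNR level.

    This recipe is a hypothetical stand-in for multi-style training data;
    the paper's actual data preparation is not described in the abstract.
    """
    training_set = []
    for utt in clean_utterances:
        training_set.append(("clean", utt))
        for snr in snr_levels_db:
            training_set.append((f"{snr}dB", mix_at_snr(utt, noise, snr)))
    return training_set


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for real audio: a 440 Hz tone and Gaussian noise.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    noisy = mix_at_snr(clean, noise, 15.0)
    print(round(measured_snr_db(clean, noisy), 2))
```

An acoustic model trained on the output of `build_multistyle_set` sees both clean and degraded versions of each utterance, which is the general idea behind multi-style training. Signal-level compensation such as CDCN would instead operate on the noisy input before recognition and is not sketched here.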
Similar resources
Reliability of Interaural Time Difference-Based Localization Training in Elderly Individuals with Speech-in-Noise Perception Disorder
Background: Previous studies have shown that interaural-time-difference (ITD) training can improve localization ability. Surprisingly little is, however, known about localization training vis-à-vis speech perception in noise based on interaural time difference in the envelope (ITD ENV). We sought to investigate the reliability of an ITD ENV-based training program in speech-in-noise perception a...
A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement
A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...
The Effect of Private Speech and Self-Regulation on Translation Quality among Iranian Translation Students: A Mixed-Methods Study
The current study presents findings from a mixed-methods study of investigating the self-regulatory role of private speech (self-talk) on students’ translation quality. The aim of the study was to validate the adapted version of a self-verbalization questionnaire. The construct validity and reliability of the scale were supported by the CFA which revealed that all items reached the acceptable f...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...